Stylometry with R: A Package for Computational Text Analysis

Authors

  • Maciej Eder
  • Jan Rybicki
  • Mike Kestemont
Abstract

This software paper describes ‘Stylometry with R’ (stylo), a flexible R package for the high-level analysis of writing style. Stylometry (computational stylistics) is the quantitative study of writing style; one application is authorship verification, which has considerable potential in forensic contexts as well as in historical research. In this paper we introduce the possibilities of stylo for computational text analysis through a number of dummy case studies from English and French literature. We demonstrate how the package is particularly useful in the exploratory statistical analysis of texts, e.g. with respect to authorial writing style. Because stylo provides an attractive graphical user interface for high-level exploratory analyses, it is especially suited to novices without programming skills (e.g. from the Digital Humanities). More experienced users can benefit from our implementation of a series of standard pipelines for text processing, as well as a number of similarity metrics.

Similar resources

Explanation in Computational Stylometry

Computational stylometry, as in authorship attribution or profiling, has a large potential for applications in diverse areas: literary science, forensics, language psychology, sociolinguistics, even medical diagnosis. Yet, many of the basic research questions of this field are not studied systematically or even at all. In this paper we will go into these problems, and suggest that a reinterpret...

Author Identification: Using Text Mining, Feature Engineering & Network Embedding

Authorship analysis is a challenging area that has developed over centuries, with research scattered widely across multiple disciplines, mainly computational linguistics, text mining, data mining, stylometry and machine learning. Conventional techniques relied heavily on stylometry and text-based content analysis of documents for authorship analysis. More recen...

Elastic constants and their variation by pressure in the cubic PbTiO3 compound using IRelast computational package within the density functional theory

In this paper, we study the structural and electronic properties of the cubic PbTiO3 compound by using the density functional the...

Detecting Style in Ancient Latin

Background: Stylometry is a field that uses statistical and computational techniques to study the style of authors. Stylometry is used to address questions of authenticity, authorship, and chronology, among others. Most famously, Mosteller and Wallace used statistical techniques to determine the disputed authorship of the Federalist Papers. More recently in 2015, stylometric techniques ...

The Impact of Oath Writing Style on Stylometric Features and Machine Learning Classifiers

Corresponding Author: Ahmad Alqurneh, Faculty of Computer Science and Information Technology, Universiti Putra Malaysia, Serdang, Malaysia. Abstract: Computational stylometry is the field that studies the distinctive style of a written text using computational tasks. The first task is how to define quantifiable measures in a text and the second is to classify the...

Publication date: 2015